machine learning algorithm
A Comparative Study of Machine Learning Algorithms for Electricity Price Forecasting with LIME-Based Interpretability
Zhao, Xuanyi, Ding, Jiawen, Huang, Xueting, Zhang, Yibo
With the rapid development of electricity markets, price volatility has significantly increased, making accurate forecasting crucial for power system operations and market decisions. Traditional linear models cannot capture the complex nonlinear characteristics of electricity pricing, necessitating advanced machine learning approaches. This study compares eight machine learning models using Spanish electricity market data, integrating consumption, generation, and meteorological variables. The models evaluated include linear regression, ridge regression, decision tree, KNN, random forest, gradient boosting, SVR, and XGBoost. Results show that KNN achieves the best performance with R^2 of 0.865, MAE of 3.556, and RMSE of 5.240. To enhance interpretability, LIME analysis reveals that meteorological factors and supply-demand indicators significantly influence price fluctuations through nonlinear relationships. This work demonstrates the effectiveness of machine learning models in electricity price forecasting while improving decision transparency through interpretability analysis.
- Energy > Power Industry (1.00)
- Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.46)
- Health & Medicine > Therapeutic Area > Endocrinology > Diabetes (0.46)
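The three scores quoted in the abstract above (R^2, MAE, RMSE) can be computed from any pair of actual and predicted series. The following is a minimal sketch using only the Python standard library; the function name `regression_metrics` and the sample prices are illustrative assumptions, not the paper's code or the Spanish market data.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return (r2, mae, rmse) for two equal-length sequences of numbers."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]

    # Mean absolute error: average size of the miss, in price units.
    mae = sum(abs(e) for e in errors) / n

    # Root mean squared error: penalizes large misses more heavily.
    ss_res = sum(e * e for e in errors)
    rmse = math.sqrt(ss_res / n)

    # R^2: share of the variance in y_true explained by the predictions.
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0.0:
        raise ValueError("R^2 is undefined when y_true is constant")
    r2 = 1.0 - ss_res / ss_tot
    return r2, mae, rmse


# Made-up prices for illustration only.
actual = [50.0, 60.0, 70.0, 80.0]
predicted = [52.0, 58.0, 73.0, 79.0]
r2, mae, rmse = regression_metrics(actual, predicted)
```

MAE and RMSE are in the same units as the price, while R^2 is unitless, which is why abstracts of this kind usually report all three.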
Machine Learning Algorithms in Statistical Modelling Bridging Theory and Application
Rao, A. Ganapathi, Anumula, Sathish Krishna, Singh, Aditya Kumar, M, Renukhadevi, Kumar, Y. Jeevan Nagendra, Tulasi, Tammineni Rama
ABSTRACT The integration of ML algorithms with traditional statistical modelling has changed the way we analyze data, perform predictive analytics, and make decisions in data-driven fields. In this paper, we study connections between ML and statistical models to understand how modern ML algorithms help 'enrich' conventional models; we demonstrate how new algorithms improve the performance, scale, flexibility and robustness of traditional models. The study shows that hybrid models offer substantial improvements in predictive accuracy, robustness, and interpretability. Keywords: Machine Learning, Statistical Modelling, Regression, Classification, Predictive Analytics, Hybrid Models, Dimensionality Reduction, Algorithmic Bias, Interpretability, Cross-Disciplinary Applications 1. INTRODUCTION Statistical modelling has historically been the theoretical framework for understanding relationships between variables, making inferences, and testing hypotheses. Its strength is that it offers interpretations in terms of interpretable parameters and probabilistic assumptions [15].
Machine Learning Algorithm for Noise Reduction and Disease-Causing Gene Feature Extraction in Gene Sequencing Data
Si, Weichen, Ou, Yihao, Tian, Zhen
In this study, we propose a machine learning-based method for noise reduction and disease-causing gene feature extraction in gene sequencing data. The DeepSeqDenoise algorithm combines a CNN and an RNN to effectively remove sequencing noise, improving the signal-to-noise ratio by 9.4 dB. We screened 17 key features by feature engineering and constructed an integrated learning model that predicts disease-causing genes with 94.3% accuracy. We successfully identified 57 new candidate disease-causing genes in a cardiovascular disease cohort validation, and detected 3 missed variants in clinical applications. The method significantly outperforms existing tools and provides strong support for accurate diagnosis of genetic diseases.
- Asia > China > Beijing > Beijing (0.04)
- North America > United States (0.04)
Optimized Approaches to Malware Detection: A Study of Machine Learning and Deep Learning Techniques
Fahim, Abrar, Dey, Shamik, Absur, Md. Nurul, Siam, Md Kamrul, Huque, Md. Tahmidul, Godhuli, Jafreen Jafor
Digital systems struggle to keep up with cybersecurity threats. The daily emergence of more than 560,000 new malware strains poses significant hazards to the digital ecosystem. Traditional malware detection methods perform poorly, yielding high false positive rates and low accuracy. This study explores how malware can be detected using machine learning (ML) and deep learning (DL) approaches to address those shortcomings. It also includes a systematic comparison of widely used models, such as random forest, multi-layer perceptron (MLP), and deep neural network (DNN), to determine their effectiveness against modern malware threats. We use a sizable dataset from Kaggle, which has undergone optimized feature selection and preprocessing to improve model performance. Our findings suggest that the DNN model outperformed the other models, with the highest training accuracy of 99.92% and an almost perfect AUC score. Furthermore, feature selection and preprocessing can help improve detection capabilities. This research contributes an analysis of model performance across the performance metrics and provides insight into the effectiveness of advanced detection techniques for building more robust and reliable cybersecurity solutions against growing malware threats.
- Asia > Bangladesh > Dhaka Division > Dhaka District > Dhaka (0.05)
- North America > United States (0.04)
- Asia > Singapore (0.04)
- Information Technology > Security & Privacy (1.00)
- Government > Military > Cyberwarfare (1.00)
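The abstract above reports an AUC score alongside accuracy. As a rough illustration of what that number measures, AUC can be read as the probability that a randomly chosen malicious sample receives a higher score than a randomly chosen benign one. The sketch below implements that pairwise definition in plain Python; the function name `roc_auc` and the labels and scores are hypothetical, and the quadratic loop is only suitable for small examples.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels: sequence of 0 (benign) or 1 (malware)
    scores: model scores, higher meaning "more likely malware"
    Ties between a positive and a negative count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical scores for four samples, for illustration only.
auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

Because AUC depends only on the ranking of scores, it is unaffected by the decision threshold, unlike accuracy.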
A Comparative Study of Machine Learning Algorithms for Stock Price Prediction Using Insider Trading Data
Chakravorty, Amitabh, Elsayed, Nelly
The research paper empirically investigates several machine learning algorithms to forecast stock prices depending on insider trading information. Insider trading offers special insights into market sentiment, pointing to upcoming changes in stock prices. This study examines the effectiveness of algorithms like decision trees, random forests, support vector machines (SVM) with different kernels, and K-Means Clustering using a dataset of Tesla stock transactions. Examining past data from April 2020 to March 2023, this study focuses on how well these algorithms identify trends and forecast stock price fluctuations. The paper uses Recursive Feature Elimination (RFE) and feature importance analysis to optimize the feature set and, hence, increase prediction accuracy. While it requires substantially greater processing time than other models, SVM with the Radial Basis Function (RBF) kernel displays the best accuracy. This paper highlights the trade-offs between accuracy and efficiency in machine learning models and proposes the possibility of pooling multiple data sources to raise prediction performance. The results of this paper aim to help financial analysts and investors in choosing strong algorithms to optimize investment strategies.
- North America > United States > Ohio > Hamilton County > Cincinnati (0.04)
- North America > United States > California > Santa Clara County > San Jose (0.04)
- North America > Mexico (0.04)
- Asia > India > Maharashtra > Mumbai (0.04)
- Banking & Finance > Trading (1.00)
- Transportation > Ground > Road (0.34)
Practical Bayesian Optimization of Machine Learning Algorithms
The use of machine learning algorithms frequently involves careful tuning of learning parameters and model hyperparameters. Unfortunately, this tuning is often a "black art" requiring expert experience, rules of thumb, or sometimes brute-force search. There is therefore great appeal for automatic approaches that can optimize the performance of any given learning algorithm to the problem at hand. In this work, we consider this problem through the framework of Bayesian optimization, in which a learning algorithm's generalization performance is modeled as a sample from a Gaussian process (GP). We show that certain choices for the nature of the GP, such as the type of kernel and the treatment of its hyperparameters, can play a crucial role in obtaining a good optimizer that can achieve expert-level performance.
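To make the idea concrete: once a GP gives a posterior mean and standard deviation for the validation error at each candidate hyperparameter setting, an acquisition function decides which setting to try next. The sketch below uses expected improvement, which is one common choice; the abstract itself does not say which acquisition function is used, so treat that as an assumption. The candidate values and posterior numbers are made up, and no GP is fitted here.

```python
import math


def expected_improvement(mu, sigma, best_so_far):
    """Expected improvement for minimization at one candidate point.

    mu, sigma: GP posterior mean and standard deviation of the error
    best_so_far: lowest error observed so far
    """
    if sigma <= 0.0:
        # No uncertainty: improvement is known exactly.
        return max(best_so_far - mu, 0.0)
    z = (best_so_far - mu) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return (best_so_far - mu) * cdf + sigma * pdf


# Hypothetical candidates: (hyperparameter value, posterior mean, posterior std).
candidates = [(0.01, 0.30, 0.01), (0.10, 0.28, 0.05), (1.00, 0.35, 0.20)]
best = 0.29

# Evaluate next wherever expected improvement is largest.
next_x = max(candidates, key=lambda c: expected_improvement(c[1], c[2], best))[0]
```

With these numbers the highly uncertain candidate wins even though its mean is worse, which shows how the criterion trades off exploring uncertain regions against exploiting promising ones.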
A Comparative Study on Machine Learning Models to Classify Diseases Based on Patient Behaviour and Habits
Musaaed, Elham, Hewahi, Nabil, Alasaadi, Abdulla
In recent years, ML algorithms have been shown to be useful for predicting diseases based on health data, making disease modeling a potential application area for these algorithms. The majority of these applications employ supervised rather than unsupervised ML algorithms. In addition, the amount of data in medical science grows rapidly each year. Moreover, these data include clinical and Patient-Related Factors (PRF), such as height, weight, age, other physical characteristics, blood sugar, lipids, insulin, etc., all of which change continually over time. Analysis of historical data can help identify disease risk factors and their interactions, which is useful for disease diagnosis and prediction. The wealth of valuable information in these data can help doctors diagnose accurately and help people become more aware of risk factors and key indicators so that they can act proactively. The purpose of this study is to use six supervised ML approaches to fill this gap by conducting a comprehensive experiment to investigate the correlation between PRF and Diabetes, Stroke, Heart Disease (HD), and Kidney Disease (KD). Moreover, it will investigate the link between Diabetes, Stroke, and KD and PRF with HD. Further, the research aims to compare and evaluate various ML algorithms for classifying diseases based on the PRF. Additionally, it aims to compare and evaluate ML algorithms for classifying HD based on PRF as well as Diabetes, Stroke, Asthma, Skin Cancer, and KD as attributes. Lastly, HD predictions will be provided through a Web-based application built on the most accurate classifier, which allows users to input their values and obtain a prediction.
- North America > United States (0.14)
- Asia > Middle East > Bahrain (0.05)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (0.69)
- Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
- Health & Medicine > Therapeutic Area > Endocrinology > Diabetes (0.94)
Machine Learning Algorithms for Detecting Mental Stress in College Students
Singh, Ashutosh, Singh, Khushdeep, Kumar, Amit, Shrivastava, Abhishek, Kumar, Santosh
In today's world, stress is a major problem that affects people's health and happiness. More and more people are feeling stressed, which can lead to many health issues such as breathing problems, feeling overwhelmed, heart attack, and diabetes. This work endeavors to forecast stress and non-stress occurrences among college students by applying various machine learning algorithms: Decision Trees, Random Forest, Support Vector Machines, AdaBoost, Naive Bayes, Logistic Regression, and K-nearest Neighbors. The primary objective of this work is to leverage a research study to predict and mitigate stress and non-stress based on the collected questionnaire dataset. We conducted a workshop with the primary goal of studying the stress levels found among the students. This workshop was attended by approximately 843 students aged 18 to 21. The students were given a questionnaire, validated under the guidance of experts from the All India Institute of Medical Sciences (AIIMS) Raipur, Chhattisgarh, India, on which our dataset is based. The survey consists of 28 questions, aiming to comprehensively understand the multidimensional aspects of stress, including emotional well-being, physical health, academic performance, relationships, and leisure. This work finds that Support Vector Machines achieve the highest accuracy for Stress, reaching 95%. The study contributes to a deeper understanding of stress determinants. It aims to improve college students' overall quality of life and academic success, addressing the multifaceted nature of stress.
- Overview (1.00)
- Research Report > New Finding (0.68)
- Health & Medicine > Therapeutic Area > Psychiatry/Psychology > Mental Health (1.00)
- Education > Educational Setting > Higher Education (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.98)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.35)
Machine Learning Algorithms to Assess Site Closure Time Frames for Soil and Groundwater Contamination
Le, Vu-Anh, Wainwright, Haruko Murakami, Gonzalez-Raymat, Hansell, Eddy-Dilek, Carol
Monitored Natural Attenuation (MNA) is gaining prominence as an effective method for managing soil and groundwater contamination due to its cost-efficiency and minimal environmental disruption. Despite its benefits, MNA necessitates extensive groundwater monitoring to ensure that contaminant levels decrease to meet safety standards. This study expands the capabilities of PyLEnM, a Python package designed for long-term environmental monitoring, by incorporating new algorithms to enhance its predictive and analytical functionalities. We introduce methods to estimate the timeframe required for contaminants like Sr-90 and I-129 to reach regulatory safety standards using linear regression and to forecast future contaminant levels with the Bidirectional Long Short-Term Memory (Bi-LSTM) networks. Additionally, Random Forest regression is employed to identify factors influencing the time to reach safety standards. Our methods are illustrated using data from the Savannah River Site (SRS) F-Area, where preliminary findings reveal a notable downward trend in contaminant levels, with variability linked to initial concentrations and groundwater flow dynamics. The Bi-LSTM model effectively predicts contaminant concentrations for the next four years, demonstrating the potential of advanced time series analysis to improve MNA strategies and reduce reliance on manual groundwater sampling. The code, along with its usage instructions, validation, and requirements, is available at: https://github.com/csplevuanh/pylenm_extension.
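The time-to-standard estimate described above can be illustrated with a small regression. The sketch below fits a straight line to the logarithm of concentration and solves for the time at which the fitted trend crosses a threshold. The log transform (first-order decay) is an assumption made here for illustration, since the abstract only says "linear regression"; the function names, data, and threshold are hypothetical and are not part of the PyLEnM API.

```python
import math


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def years_to_threshold(years, concentrations, threshold):
    """Time at which the fitted log-linear trend reaches the threshold.

    Returns None when the fitted trend is not decreasing, since the
    threshold would then never be reached by extrapolation.
    """
    logs = [math.log(c) for c in concentrations]  # assumes all c > 0
    intercept, slope = fit_line(years, logs)
    if slope >= 0.0:
        return None
    return (math.log(threshold) - intercept) / slope


# Synthetic monitoring record: 10% decay per year from 100 units.
years = [0, 1, 2, 3, 4, 5]
conc = [100.0 * math.exp(-0.1 * t) for t in years]
t_cross = years_to_threshold(years, conc, threshold=10.0)
```

On real well data the fit would be noisy, so the crossing time is best reported with an uncertainty range instead of a single value.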
Using Sentiment and Technical Analysis to Predict Bitcoin with Machine Learning
Carosia, Arthur Emanuel de Oliveira
Cryptocurrencies have gained significant attention in recent years due to their decentralized nature and potential for financial innovation. Thus, the ability to accurately predict their prices has become a subject of great interest for investors, traders, and researchers. Some works in the literature show how Bitcoin's market sentiment correlates with its price fluctuations. However, papers that combine market sentiment with financial Technical Analysis indicators in order to predict Bitcoin's price are still scarce. In this paper, we present a novel approach for predicting Bitcoin price movements by combining the Fear & Greedy Index, a measure of market sentiment, Technical Analysis indicators, and the potential of Machine Learning algorithms. This work represents a preliminary study on the importance of sentiment metrics in cryptocurrency forecasting. Our initial experiments demonstrate promising results in terms of investment returns, surpassing the Buy & Hold baseline, and offer valuable insights into combining sentiment and market indicators in a cryptocurrency prediction model.
- South America > Brazil > São Paulo (0.04)
- South America > Brazil > Roraima > Boa Vista (0.04)
- Europe > Montenegro (0.04)
- Asia > China > Hong Kong (0.04)
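The comparison against a Buy & Hold baseline mentioned in the abstract above can be sketched as follows. A model emits a daily signal (hold the asset or stay in cash), and the cumulative return of following those signals is compared with simply buying at the start and holding to the end. The function names, prices, and signals are hypothetical, and the sketch ignores trading fees and slippage, which a real backtest would need to include.

```python
def buy_and_hold_return(prices):
    """Total return from buying at the first price and selling at the last."""
    return prices[-1] / prices[0] - 1.0


def strategy_return(prices, signals):
    """Total return of a long-or-cash strategy.

    signals[i] == 1 means hold the asset from prices[i] to prices[i + 1];
    signals[i] == 0 means stay in cash over that step.
    """
    if len(signals) != len(prices) - 1:
        raise ValueError("need exactly one signal per price step")
    growth = 1.0
    for i, signal in enumerate(signals):
        if signal == 1:
            growth *= prices[i + 1] / prices[i]
    return growth - 1.0


# Made-up prices: up 10%, down 10%, up 10%.
prices = [100.0, 110.0, 99.0, 108.9]
signals = [1, 0, 1]  # a model that happens to sit out the losing step

baseline = buy_and_hold_return(prices)
strategy = strategy_return(prices, signals)
```

A strategy is said to surpass the baseline when its return over the evaluation window exceeds the Buy & Hold return for the same window.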